Maximum likelihood reconstruction of ancestral amino-acid sequences
Abstract
Maximum-likelihood methods are used extensively in phylogenetic studies [3]. In particular, amino-acid sequences of ancestral species have been inferred using these methods [7]. Such ancestral reconstruction tasks aim to identify either the most likely sequence in a specific ancestral species (marginal reconstruction) or the most likely set of ancestral states for all the ancestral taxa in a given phylogeny (joint reconstruction [6]). Joint reconstruction is motivated by studies of phenomena involving several independent lineages, such as [8], and is implemented in [6]. However, existing algorithms for this task are exhaustive and take exponential time. Furthermore, these algorithms assume a naive model of evolution, i.e., a constant substitution rate across sites, whereas [5] shows that models incorporating rate variation among sites are statistically superior. In this work we: (a) Devise a dynamic programming algorithm for joint reconstruction. Its running time is linear in the number of sequences, but it assumes no rate variation among sites. (b) Present a greedy heuristic for joint reconstruction assuming rate variation among sites. (c) Introduce a speed-up for calculating the replacement probabilities between any two states.
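The abstract gives no details of the dynamic program in (a), so the following is only a minimal sketch of one common way such a recursion can be organized for a single alignment column, not necessarily the authors' procedure. For each node and each possible state of its parent, it records the best achievable likelihood of the subtree and the node state that achieves it, then traces the choices back from the root. The equal-rates model `jc_prob`, the tree encoding, and all names here are illustrative assumptions. Without rate variation among sites, each column can be treated independently in this way.

```python
import math

def jc_prob(i, j, t, n=4):
    """Toy equal-rates model on n states (an assumption, not the paper's model)."""
    decay = math.exp(-n * t / (n - 1))
    return 1 / n + (n - 1) / n * decay if i == j else (1 - decay) / n

def joint_reconstruct(children, root, leaf_state, n=4, prob=jc_prob, prior=None):
    """Most likely joint assignment of states to internal nodes for one site.

    children:   dict mapping a node to a list of (child, branch_length) pairs
    leaf_state: dict mapping each leaf to its observed state (0 .. n-1)
    """
    prior = prior or [1 / n] * n
    best_lik = {}    # best_lik[x][i]: best subtree likelihood at x if x's parent is in state i
    best_state = {}  # best_state[x][i]: state of x that achieves best_lik[x][i]

    def below(x, j):
        # Best likelihood of everything under x when x itself is in state j.
        return math.prod(best_lik[c][j] for c, _ in children.get(x, []))

    def upward(x, t):
        for c, tc in children.get(x, []):
            upward(c, tc)
        # A leaf can only take its observed state; an internal node may take any.
        options = [leaf_state[x]] if x in leaf_state else range(n)
        best_lik[x], best_state[x] = [], []
        for i in range(n):
            j = max(options, key=lambda s: prob(i, s, t) * below(x, s))
            best_state[x].append(j)
            best_lik[x].append(prob(i, j, t) * below(x, j))

    for c, tc in children[root]:
        upward(c, tc)
    root_state = max(range(n), key=lambda s: prior[s] * below(root, s))
    likelihood = prior[root_state] * below(root, root_state)

    # Trace the recorded choices back down from the root.
    states = {root: root_state}
    stack = [root]
    while stack:
        x = stack.pop()
        for c, _ in children.get(x, []):
            if c not in leaf_state:
                states[c] = best_state[c][states[x]]
                stack.append(c)
    return states, likelihood

# Hypothetical three-leaf tree: ((a, b)A, c)R
children = {"R": [("A", 0.1), ("c", 0.3)], "A": [("a", 0.1), ("b", 0.1)]}
leaf_state = {"a": 0, "b": 0, "c": 1}
states, likelihood = joint_reconstruct(children, "R", leaf_state)
```

Each node is visited once and does work proportional to the square of the alphabet size per child, so the cost grows linearly with the number of nodes, whereas enumerating every joint assignment grows exponentially.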
Similar resources
Assessing the Accuracy of Ancestral Protein Reconstruction Methods
The phylogenetic inference of ancestral protein sequences is a powerful technique for the study of molecular evolution, but any conclusions drawn from such studies are only as good as the accuracy of the reconstruction method. Every inference method leads to errors in the ancestral protein sequence, resulting in potentially misleading estimates of the ancestral protein's properties. To assess t...
A fast algorithm for joint reconstruction of ancestral amino acid sequences.
A dynamic programming algorithm is developed for maximum-likelihood reconstruction of the set of all ancestral amino acid sequences in a phylogenetic tree. To date, exhaustive algorithms that find the most likely set of ancestral states (joint reconstruction) have running times that scale exponentially with the number of sequences and are thus limited to very few taxa. The time requirement of o...
Ancestral Nucleotide and Amino Acid Sequences
A statistical method was developed for reconstructing the nucleotide or amino acid sequences of extinct ancestors, given the phylogeny and sequences of the extant species. A model of nucleotide or amino acid substitution was employed to analyze data of the present-day sequences, and maximum likelihood estimates of parameters such as branch lengths were used to compare the posterior probabilitie...
A Practical Algorithm for Estimation of the Maximum Likelihood Ancestral Reconstruction Error
The ancestral sequence reconstruction problem asks to predict the DNA or protein sequence of an ancestral species, given the sequences of extant species. Such reconstructions are fundamental to comparative genomics, as they provide information about extant genomes and the process of evolution that gave rise to them. Arguably the best method for ancestral reconstruction is maximum likelihood est...
PAML: a program package for phylogenetic analysis by maximum likelihood
PAML, currently in version 1.2, is a package of programs for phylogenetic analyses of DNA and protein sequences using the method of maximum likelihood (ML). The programs can be used for (i) maximum likelihood estimation of evolutionary parameters such as branch lengths in a phylogenetic tree, the transition/transversion rate ratio, the shape parameter of the gamma distribution for variable evol...